Abstract - We present a stochastic evolutionary model obtained through a
perturbation of Kauffman's maximally rugged model, which is recovered as a
special case. Our main results are: (i) existence of a percolation-like
phase transition in the finite phase space case; (ii) existence of non
self-averaging effects in the thermodynamic limit. Lack of self-averaging
emerges from a fragmentation of the space of all possible evolutions,
analogous to that of a geometrically broken object. Thus the model turns
out to be exactly solvable in the thermodynamic limit.



1 Introduction 

We present here the analytic study of a model of an abstract behaviour with
frustrated rationality. The model, despite or because of its naivety, has
revealed interesting statistical properties, such as a percolative phase
transition in the finite dimensional case and non self-averaging effects in
the thermodynamic limit. Our starting point is Kauffman's well-known NK
model of biological evolution (see Ref. [?]) in its maximally rugged version
($K = N - 1$), whose properties have been extensively investigated. In this
paper, however, it serves as a metaphoric abstract model for the behaviour
of a fully rational adaptive walker that moves in its phase space in search
of an optimal configuration. We perturb its stringent rationality by
introducing in the evolutionary rule a probability $p$, as a measure of a
certain degree of insanity (or frustration, or disorder). For $p = 1$ we
recover the original model, whereas for $p = 0$ we have a random walker in
configuration space. The introduction of $p$ is fatal for the adaptive
behaviour, but leads to a percolation-like phase transition that separates
a phase characterized by finite walks to optima from one in which the
probability of an interminable walk is non zero. We show that in a large
configuration space a small perturbation is sufficient to reach the
percolation threshold. The thermodynamic limit is obtained by letting the
cardinality of the phase space go to infinity. We argue that this leads to
an infinite number of different possible evolutions. Nevertheless, in this
limit we show that the probability $Y$ that two walkers undergo "similar"
(in a sense that will become clear later) evolutions has non zero average
and a finite variance, that is, it lacks self-averaging. This property will
result from a fragmentation of the space of all possible evolutions
analogous to that of a geometrically broken object (Ref. [?]).

Evolutionary models have become quite familiar to theoretical physicists,
and many of them have been carefully examined (see Refs. [?, ?] for a
review). This is for two main reasons. (i) Species undergoing biological
evolution are dynamical systems, in the sense that their configuration
varies with time according to some dynamical law that can be modelled. The
dynamics draws a trajectory in the system's phase space, that is, the set
of all possible configurations. (ii) Biological evolution is a complex
phenomenon, since one must assume that each step of it derives from and is
influenced by the concurrence of different factors, which may be taken into
account altogether as a number of random variables that make the system's
trajectory unpredictable and stochastic. Hence, many ideas taken from the
theory of disordered systems, the main one being that of landscape, may be
fruitfully applied to the construction and the study of these seemingly
different types of models (see Ref. [?] for a detailed overview).

One assumes that an evolving species (the system) may be found in any
of a number of configurations, representing its genome. This is taken for
simplicity to be a finite set of spin variables $S_i$ ($S_i = \pm 1$,
$1 \le i \le N$). The phase space $\Gamma$ is the set $\{+1,-1\}^N$ of all
genomes. The metric in $\Gamma$ is typically the Hamming distance $d_H$.
Feature (i) is recovered by giving some evolutionary algorithm $F$ such that

$$C_{t+1} = F(C_t), \tag{1}$$

where $C_t$ is the system's configuration at time $t$ and time is a
non-negative integer. Feature (ii) is introduced through the concept of
landscape. For our purposes, a landscape is a pair $(\Gamma, \phi)$,
$\Gamma$ being the system's phase space and $\phi$ being a real-valued
function $\phi : \Gamma \to \mathbb{R}$ called fitness, defined for all
$C \in \Gamma$. The idea underlying biological evolutionary models is that
$F$ lets the system evolve through configurations of growing fitness in
search of an optimal one. This optimization procedure is usually not
global, that is, the system does not seek the fittest configuration in
$\Gamma$; the configurations considered optimal are those $x \in \Gamma$
such that $\phi(x) > \phi(y)$ for all $y \in \Gamma$ such that
$d_H(x,y) = 1$. These are called "local optima".

Of course, the complexity arises from the difficulty in finding the local
optima or, if one prefers, from the specific form of $\phi$, which may
possibly depend on $t$. The more rugged the landscape, namely the higher
the number of maxima and minima of $\phi$, the more complex the dynamics.

In Kauffman's original idea the fitness of each configuration resulted from
epistatic interactions between $K$ of its genes. An increase of $K$ implied
an increase of the number of local fitness optima. This way of tuning the
landscape's complexity is equivalent to the following, which may sound more
familiar to physicists (see Ref. [?] for an overview of the contact points
between spin glass physics and biology). One assumes that the fitness of a
configuration $C$ is given by a $K$-spin type of Hamiltonian,

$$\phi(C) = \sum_{i_1,\dots,i_K=1}^{N} J_{i_1 \cdots i_K}\, S_{i_1} \cdots S_{i_K}, \tag{2}$$

where $C = \{S_1,\dots,S_N\}$ and the $J_{i_1 \cdots i_K}$ are Gaussian
random variables, with $K \le N$. It is possible to show (see Refs. [?, ?]
for details) that as the parameter $K$ varies from 1 to $N$ the landscape's
ruggedness grows accordingly, since correlations between the fitness values
of neighbouring configurations (configurations $x$ and $y$ such that
$d_H(x,y) = 1$) decrease. Therefore for large $K$, which implies large $N$,
one finds that the probability $P(\phi_1,\phi_2)$ that two configurations
$C_1$ and $C_2$ have fitnesses $\phi_1 = \phi(C_1)$ and
$\phi_2 = \phi(C_2)$ respectively factorizes:

$$P(\phi_1,\phi_2) \simeq P(\phi_1)\,P(\phi_2). \tag{3}$$

For all practical purposes $\phi$ thus behaves as a random variable. This is
Kauffman's maximally rugged landscape, which is equivalent to Derrida's
random energy model (again Refs. [?, ?]).

This paper is organized as follows. In Section 2 we give an account of
Kauffman's maximally rugged model, with its main statistical properties.
Though much of the material of Section 2 consists of well-established
results, we include it both to make the paper self-contained and to
emphasize how the perturbation acts on the system. Section 3 contains the
definition of the perturbed model and its analytic study in the finite
phase space case. In Section 4 we study the thermodynamic limit. In the
final section we make some comments on our results and formulate the
conclusions.

2 Kauffman's maximally rugged model



Kauffman's maximally rugged model is defined as follows: the system may
take on any of the $2^N$ configurations of the phase space
$\Gamma = \{+1,-1\}^N$, and large $N$ is assumed. The fitness $\phi$ is a
quenched random variable whose probability density is, say, $\rho(\phi)$.
The dynamics $F$ is then defined as a zero temperature Monte Carlo
algorithm:

1. at time $t \ge 0$ the system is in configuration
$C_t = \{S_1,\dots,S_N\}$ with fitness $\phi(C_t)$;

2. a spin $S_i$ of $C_t$ is chosen at random and its sign is changed, thus
obtaining a configuration $C'$ that differs from $C_t$ by just the $i$-th
spin ($1 \le i \le N$);

3. if $\phi(C') > \phi(C_t)$ then $C_{t+1} = C'$; otherwise $C_{t+1} = C_t$
and return to step 2.
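
The three steps above can be sketched as a short simulation. The Python
sketch below is only illustrative and rests on assumptions that are not in
the text: fitness values are drawn uniformly in [0, 1) (any continuous
density would serve), configurations are encoded as N-bit integers, the
landscape is generated lazily as configurations are probed, and the names
`adaptive_walk`, `length` and `duration` are hypothetical.

```python
import random


def adaptive_walk(n, rng):
    """Run one adaptive walk on a random landscape of n spins.

    Returns (length, duration): the number of accepted mutations and the
    total number of attempted spin flips before a local optimum is reached.
    """
    fitness = {}  # quenched landscape, filled in lazily

    def phi(c):
        if c not in fitness:
            fitness[c] = rng.random()
        return fitness[c]

    c = 0             # configuration encoded as an n-bit integer
    fitness[c] = 0.0  # start from the least fit configuration
    length = duration = 0
    while True:
        neighbours = [c ^ (1 << i) for i in range(n)]
        if all(phi(x) <= phi(c) for x in neighbours):
            return length, duration  # local optimum: the walk ends
        # steps 2 and 3: flip random spins until a fitter neighbour is found
        while True:
            duration += 1
            trial = c ^ (1 << rng.randrange(n))
            if phi(trial) > phi(c):
                c = trial
                length += 1
                break


if __name__ == "__main__":
    print(adaptive_walk(16, random.Random(0)))
```

Averaging `length` and `duration` over many walks and many landscapes would
give Monte Carlo estimates of the two averages discussed in this section.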

In a rough biological interpretation, this models a situation in which a
species evolves, increasing its fitness by random point mutations.
Trajectories come to an end when the system is in a local fitness maximum,
because it cannot find any fitter neighbour. $F$ is a stochastic dynamics
that takes the system to such optima passing through configurations of
increasing fitness that differ from one another by just one spin. The
trajectories are usually called adaptive walks, and their length is
strictly related to the local properties of the fitness landscape. These
have been analytically studied (see Refs. [?] and [?]), revealing a rich
structure of very numerous maxima, as we shall soon recall.

Before coming to that, we would like to stress that in what follows we
shall consider two types of averages. The first one, which we shall call a
"quenched" average, will be denoted by an overbar ($\overline{\cdots}$) and
indicates averages over all possible fitness realizations. Suppose we are
given a certain quantity $q$ (for instance, the number of fitness maxima)
that may take on different values in different realizations of $\phi$. The
average of $q$ over all possible fitness samplings will be written
$\overline{q}$. This notation is slightly unusual, since generally this
type of average is denoted by brackets. The second one will instead be
denoted here by angle brackets ($\langle \cdots \rangle$) and will define
averages over many different evolutions. For instance, we shall deal with
the average length $\langle \ell \rangle$ of an adaptive walk. This could
be written $\langle \ell \rangle = \int \ell \, dQ(\ell)$, where $Q(\ell)$
is the probability that an adaptive walk consists of $\ell$ steps. In
principle it may be difficult to obtain analytical information about the
probabilities $Q(x)$ ($x$ = length, duration, ...). This average may
nevertheless be estimated as follows: one fixes the landscape and averages
the lengths of many walks with the same starting point, which by assumption
will be the least fit configuration in $\Gamma$. The ensembles over which
averages are calculated are thus that of all possible landscapes on
$\Gamma$ for $\overline{\cdots}$ and that of all possible evolutions (for
example, adaptive walks) for $\langle \cdots \rangle$.

We begin by proving that the average number of local optima increases
exponentially with $N$.

Result 1. Let $f(N)$ denote the fraction of local fitness optima in
$\Gamma$ in a given fitness realization (landscape). We have

$$\overline{f(N)} = \frac{1}{N+1}. \tag{4}$$

The proof is straightforward: let $y(\phi)$ denote the probability that a
given configuration has a lower fitness than $\phi$, namely
$y(\phi) = \int_{-\infty}^{\phi} \rho(\phi')\,d\phi'$. Since for a local
optimum of fitness $\phi$ the $N$ neighbouring configurations must have
lower fitness, we have that $\overline{f(N)}$ is the average of $y^N$ over
all possible choices of $y$. The probability density $q(y)$ of $y$ is
uniform, hence

$$\overline{f(N)} = \int_0^1 y^N\,dy = \frac{1}{N+1}. \tag{5}$$
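
As a numerical sanity check of Result 1, one can enumerate a small
landscape exhaustively and count the local optima. The sketch below is an
illustration under assumptions of ours, not part of the original analysis:
fitness values are i.i.d. uniform numbers, configurations are N-bit
integers, and the function name `fraction_of_local_optima` is hypothetical.

```python
import random


def fraction_of_local_optima(n, rng):
    """Fraction of configurations fitter than all n one-spin-flip neighbours."""
    size = 1 << n
    phi = [rng.random() for _ in range(size)]  # one quenched landscape
    optima = sum(
        all(phi[c] > phi[c ^ (1 << i)] for i in range(n))
        for c in range(size)
    )
    return optima / size


if __name__ == "__main__":
    rng = random.Random(1)
    n = 10
    samples = [fraction_of_local_optima(n, rng) for _ in range(20)]
    # quenched average over 20 landscapes, to be compared with 1 / (n + 1)
    print(sum(samples) / len(samples), 1 / (n + 1))
```

The average over landscapes plays the role of the overbar; for small `n`
the exhaustive enumeration is cheap, while for large `n` it becomes
infeasible because the phase space has $2^N$ points.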

On the average there are thus $2^N/(N+1)$ local optima, so that their
number grows exponentially with $N$. Now consider making $\ell$ steps of an
adaptive walk starting from the least fit configuration in $\Gamma$. One
finds that on the average the probability to take a further step, namely
the fraction of fitter neighbours, is halved each time a step is taken.

Result 2. Let $F(\ell)$ denote the fraction of fitter neighbours after
$\ell$ steps. We have

$$\langle F(\ell) \rangle = 2^{-\ell}. \tag{6}$$

Indeed, an adaptive walk can be seen as a sequence of increasing but
independent values of $\phi$. If we consider $y(\phi)$ instead of $\phi$,
an adaptive walk becomes a sequence of increasing values of $y$, which is,
as said above, a random variable with uniform probability density on the
$[0,1]$ interval. For one walk of $\ell$ steps, namely for one increasing
sequence of $\ell$ independent values $y_1,\dots,y_\ell$ of $y$, we can
write the probability to find an $(\ell+1)$-th value of $y$ greater than
all of the previous $\ell$ as

$$F(\ell) = P(y_2 > y_1)\,P(y_3 > y_2) \cdots P(y_{\ell+1} > y_\ell), \tag{7}$$

where $P(y_n > y_m)$ denotes the probability of sampling a value $y_n$ of
$y$ greater than $y_m$. $\langle F(\ell) \rangle$ may be obtained by
averaging over all possible samplings of $y_1,\dots,y_\ell$:

$$\langle F(\ell) \rangle = \langle P(y_2 > y_1)\,P(y_3 > y_2) \cdots P(y_{\ell+1} > y_\ell) \rangle. \tag{8}$$

Clearly, $P(y_{n+1} > y_n) = 1 - \int_0^{y_n} q(y)\,dy = 1 - y_n$, because
$q(y)$ is uniform. Hence

$$\langle F(\ell) \rangle = \langle (1-y_1) \cdots (1-y_\ell) \rangle. \tag{9}$$

The statistical independence of $y_1,\dots,y_\ell$ implies that

$$\langle F(\ell) \rangle = \langle (1-y_1) \rangle \cdots \langle (1-y_\ell) \rangle = \langle (1-y) \rangle^{\ell}. \tag{10}$$

Now simply $\langle (1-y) \rangle = \int_0^1 (1-y)\,dy = 1/2$, hence the
result follows.
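
The halving stated in Result 2 can also be checked by a small Monte Carlo
experiment. The sketch below does not simulate the full landscape; it
assumes, as an idealization of ours, that each accepted mutation lands on a
value of $y$ drawn uniformly among the values above the current one, so
that the fraction $1 - y$ of fitter neighbours is multiplied by a uniform
random number at every step. The function name `mean_fitter_fraction` is
hypothetical.

```python
import random


def mean_fitter_fraction(steps, walks, rng):
    """Monte Carlo estimate of <F(l)>, the mean fraction of fitter neighbours.

    Assumption: after each accepted step the new value of y is uniform on
    the interval (y, 1), starting from the least fit configuration y = 0.
    """
    total = 0.0
    for _ in range(walks):
        y = 0.0
        for _ in range(steps):
            y = y + (1.0 - y) * rng.random()  # uniform on (y, 1)
        total += 1.0 - y
    return total / walks


if __name__ == "__main__":
    rng = random.Random(0)
    for steps in range(1, 6):
        # estimate next to the prediction 2 ** -steps
        print(steps, mean_fitter_fraction(steps, 50_000, rng), 2.0 ** -steps)
```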

Let us now turn to the study of the statistical properties of adaptive 
walks. The two major outcomes are concerned with the average length of 
an adaptive walk, which represents the average number of configurations the 
system has assumed from its starting one to a local fitness maximum, and 
with the average duration of an adaptive walk, namely the total number of 
tried mutations, those accepted and those refused. 

Result 3. Let $\langle \ell(N) \rangle$ and $\langle t(N) \rangle$ denote,
respectively, the average length and the average duration of an adaptive
walk. If $N \gg 1$ we have

1. $\langle \ell(N) \rangle \simeq \log_2 N$;

2. $\langle t(N) \rangle \simeq N$.

1. is an estimate for $\langle \ell(N) \rangle$. It is obtained by
considering that an adaptive walk ends when the fraction $F(\ell)$ of
fitter neighbours falls below $1/N$. Hence the average length is that for
which $\langle F(\ell) \rangle \simeq 1/N$. From Result 2 one immediately
gets

$$2^{-\ell} \simeq \frac{1}{N}, \tag{11}$$

whence the estimate $\langle \ell(N) \rangle \simeq \log_2 N$ follows. A
more rigorous though much more complicated estimate has been derived in
Ref. [10], where it is shown that $\langle \ell(N) \rangle \sim \log N$
with a proportionality constant that is slightly different from
$(\log 2)^{-1}$. Therefore it is reasonable to take 1. as a fairly good
estimate.

For what concerns 2., we consider that since the fraction of fitter
neighbours is halved on the average at each step, the waiting time (if one
prefers, the number of tried and refused mutations) doubles on the average
at each step. So the average number of time units one has to wait in order
to take the $\ell$-th step is $2^{\ell-1}$ (one has to wait a time 1 to
take the first step because by assumption each walk starts from the least
fit configuration in $\Gamma$). We obtain $\langle t(N) \rangle$ by summing
all waiting times in each configuration passed through in an adaptive walk,
the average number of which is given by 1.; hence

$$\langle t(N) \rangle \simeq \sum_{\ell=1}^{\log_2 N} 2^{\ell-1} = N - 1, \tag{12}$$

where the sum has been performed as if $\log_2 N$ were an integer. For
large $N$, 2. is recovered. Again, in Ref. [?] it is shown that a more
rigorous derivation of $\langle t(N) \rangle$ yields the same result
$\langle t(N) \rangle \sim N$ up to a proportionality constant that is just
slightly different from 1. Therefore 2. may well be considered a good
estimate.

3 Perturbing Kauffman's model

In the previous section we have recalled the statistical properties of
Kauffman's maximally rugged model. Following its dynamical rule $F$ the
system can evolve only through fitter configurations. In some sense,
looking back at spin glasses, one could say that it lacks frustration. The
system always does the right thing, always finds its way in the rugged
landscape, reaches a fitness maximum in a finite number of steps, and
that's it; failures are ruled out. In our perturbed version of this model
we want to frustrate the rationality of the system with an additional
selective pressure $p$, acting as a constraint on the system's optimizing
ability.

We thus consider a system whose phase space is $\Gamma = \{+1,-1\}^N$,
evolving in a landscape where the fitness $\phi$ is a quenched random
variable. The law of evolution $F_p$ depends on a real parameter
$p \in [0,1]$ through the following definition:

1. at time $t \ge 0$ the system is in configuration
$C_t = \{S_1,\dots,S_N\}$ with fitness $\phi(C_t)$;

2. a spin $S_i$ of $C_t$ is chosen at random and its sign is changed, thus
obtaining a configuration $C'$ that differs from $C_t$ by just the $i$-th
spin ($1 \le i \le N$);

3. if $\phi(C') > \phi(C_t)$ then, with probability $p$, $C_{t+1} = C'$
and, with probability $1-p$, $C_{t+1}$ is chosen at random among the
neighbouring configurations of $C_t$;

4. if $\phi(C') \le \phi(C_t)$, then $C_{t+1} = C_t$ and return to step 2.

The landscape's statistical properties are the same as those of Kauffman's
model, so that Result 1 still holds. The difference with the original model
is that this time the system accepts a favourable mutation only with
probability $p$. If it does not, it is forced to choose a random spin and
change its sign, regardless of the fitness of the newly obtained
configuration. By this we mean to model a system that undergoes an external
evolutionary pressure, whose strength increases as $p$ varies from 1 to 0,
as it evolves in a rugged landscape. The pressure is a perturbation of the
dynamics, such that the case $p = 1$ corresponds to the unperturbed model.
We shall see that a small perturbation is sufficient to drastically change
all statistical properties of the model. For example, the average length of
a trajectory, which we shall call a p-walk, diverges.
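
The perturbed rule can be sketched in the same spirit as the unperturbed
one. The Python sketch below is illustrative only and relies on assumptions
of ours: uniform i.i.d. fitness values generated lazily, configurations
encoded as N-bit integers, both accepted mutations and forced random moves
counted as steps of the p-walk, and a cap `max_steps` (a hypothetical
parameter) so that a walk that never stops still returns.

```python
import random


def p_walk(n, p, rng, max_steps=10_000):
    """Run one p-walk on a random landscape of n spins.

    Returns (length, duration, stopped): steps taken, spin flips attempted,
    and whether a local optimum was reached before max_steps.
    """
    fitness = {}  # quenched landscape, filled in lazily

    def phi(c):
        if c not in fitness:
            fitness[c] = rng.random()
        return fitness[c]

    c = 0
    fitness[c] = 0.0  # start from the least fit configuration
    length = duration = 0
    while length < max_steps:
        neighbours = [c ^ (1 << i) for i in range(n)]
        if all(phi(x) <= phi(c) for x in neighbours):
            return length, duration, True  # no fitter neighbour: walk ends
        trial = rng.choice(neighbours)
        duration += 1
        if phi(trial) > phi(c):
            if rng.random() < p:
                c = trial                   # p-step: accept the mutation
            else:
                c = rng.choice(neighbours)  # (1 - p)-step: random neighbour
            length += 1
        # otherwise the mutation is refused and another spin is tried
    return length, duration, False


if __name__ == "__main__":
    for p in (1.0, 0.99, 0.9, 0.5):
        print(p, p_walk(16, p, random.Random(0), max_steps=2_000))
```

For `p = 1.0` the sketch reduces to an adaptive walk and always stops; for
smaller `p` the cap is hit more and more often in this toy setting.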

Let us consider the case of finite $N$. We begin by deriving, for the
perturbed model, the analogue of Result 2 for the unperturbed one.

Result 4. Let $F_p(\ell)$ denote the fraction of fitter neighbours after
$\ell$ steps of a p-walk and let
$\langle F_p(\ell) \rangle = \langle F(\ell) \rangle_p$. We have

$$\langle F(\ell) \rangle_p = \frac{1}{2-p}\left[\left(\frac{p}{2}\right)^{\ell} + 1 - p\right]. \tag{13}$$

One sees that in the $p \to 1$ limit Result 2 is recovered.

The proof is not difficult but tedious. Observe that at each mutation the
system makes a choice between two symbols: $p$ and $1-p$. Let $\Omega$
denote the set of all possible sequences of choices the system can make in
a p-walk of given length $\ell$, namely $\Omega = \{p, 1-p\}^{\ell}$. One
can think of a p-walk $\omega$ of $\ell$ steps as an element of $\Omega$ of
the form

$$\omega = \{\omega_1, \dots, \omega_\ell\}, \tag{14}$$

where $\omega_j \in \{p, 1-p\}$ for $1 \le j \le \ell$. We shall call
$\Omega$ the "space of p-walks". Considering that when the system accepts a
positive mutation the average fraction of fitter neighbours is halved, one
can construct a partition of $\Omega$, made of subsets of "similar"
p-walks:

1. the first subset $\Omega_1$ contains those p-walks such that
$\omega_\ell = 1-p$;

2. the second subset $\Omega_2$ contains those p-walks such that
$\omega_{\ell-1} = 1-p$ and $\omega_\ell = p$;

3. the $k$-th subset $\Omega_k$ ($1 \le k \le \ell-1$) contains those
p-walks such that $\omega_{\ell-k+1} = 1-p$ and
$\omega_{\ell-k+2}, \dots, \omega_\ell = p$;

4. the $\ell$-th subset $\Omega_\ell$ contains the p-walks
$\{1-p, p, \dots, p\}$ and $\{p, \dots, p\}$.

We shall call "types" of p-walks the subsets $\Omega_m$
($m = 1, \dots, \ell$), so that a p-walk $\omega \in \Omega_m$ is a p-walk
of the $m$-th type. The similarity consists in the fact that all p-walks of
the $m$-th type are such that, on the average, after the $\ell$-th step a
fraction $2^{-m}$ of the neighbours is fitter than the configuration
reached by the system. This is so because this average fraction depends on
how many p-steps (steps in which the mutation has been accepted) the system
has made since the last $(1-p)$-step. In fact, a $(1-p)$-step brings the
system to a configuration having, on the average, a fraction $1/2$ of
fitter neighbours (if $N$ is sufficiently large) and each following p-step
halves this fraction. For example, if $n-1$ p-steps are taken after a
$(1-p)$-step (namely if a p-walk of the $n$-th type is made), the average
fraction of fitter neighbours will be $2^{-n}$. So the probability to have
$2^{-m} \cdot N$ fitter neighbours after $\ell$ steps equals the
probability $P(\Omega_m)$ to take a p-walk of the $m$-th type. This is
easily calculated: the probability $P(\omega)$ that the p-walk
$\omega = \{\omega_1, \dots, \omega_\ell\}$ is made is simply

$$P(\omega) = \prod_{i=1}^{\ell} \omega_i \tag{15}$$

and thus

$$P(\Omega_m) = \sum_{\omega \in \Omega_m} P(\omega). \tag{16}$$

A straightforward calculation shows that

$$P(\Omega_\ell) = p^{\ell-1}, \tag{17}$$

$$P(\Omega_k) = (1-p)\,p^{k-1}, \tag{18}$$

with $1 \le k \le \ell-1$. Hence $\langle F(\ell) \rangle_p$ may be derived
from the formula

$$\langle F(\ell) \rangle_p = \sum_{m=1}^{\ell} P(\Omega_m)\,2^{-m}, \tag{19}$$

that makes use of the fact that the average fraction of fitter neighbours
is $2^{-m}$ with probability $P(\Omega_m)$ (namely, when the walk done is
of the $m$-th type), and of the fact that, clearly,

$$\sum_{m=1}^{\ell} P(\Omega_m) = 1. \tag{20}$$

We rewrite formula (19) explicitly:

$$\langle F(\ell) \rangle_p = p^{\ell-1}\,2^{-\ell} + \sum_{k=0}^{\ell-2} (1-p)\,p^{k}\,2^{-(k+1)}. \tag{21}$$

Performing the sum and with a minor rearrangement of the terms Result 4
is obtained.
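
The summation that closes the proof can be checked numerically. The sketch
below compares the sum over types, with weights $(1-p)p^{k-1}$ for the
first $\ell - 1$ types and $p^{\ell-1}$ for the last one, with the closed
form $[(p/2)^{\ell} + 1 - p]/(2 - p)$ of Result 4. The function names are
hypothetical and the code is a consistency check, not a proof.

```python
def type_probabilities(p, steps):
    """Probabilities of the types m = 1, ..., steps of a p-walk."""
    probs = [(1 - p) * p ** (k - 1) for k in range(1, steps)]
    probs.append(p ** (steps - 1))  # the last type collects the remainder
    return probs


def mean_fraction_sum(p, steps):
    """Mean fraction of fitter neighbours as a sum over types."""
    probs = type_probabilities(p, steps)
    return sum(w * 2.0 ** (-m) for m, w in enumerate(probs, start=1))


def mean_fraction_closed(p, steps):
    """Closed form of the same quantity after performing the sum."""
    return ((p / 2) ** steps + 1 - p) / (2 - p)


if __name__ == "__main__":
    for p in (0.0, 0.3, 0.9, 1.0):
        for steps in (1, 2, 5, 10):
            print(p, steps, mean_fraction_sum(p, steps),
                  mean_fraction_closed(p, steps))
```

For `p = 1.0` both functions return `2 ** -steps`, in agreement with the
unperturbed model.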

Result 4 is the starting point for deriving an estimate for the average
length $\langle \ell(N) \rangle_p$ of a p-walk. It is sufficient to
consider that on the average the walk stops when
$\langle F(\ell) \rangle_p$ falls below the value $1/N$, that is, when
there are no fitter neighbours. Hence the stopping condition reads

$$\langle F(\ell) \rangle_p \simeq \frac{1}{N}, \tag{22}$$

which leads to

$$\frac{1}{2-p}\left[\left(\frac{p}{2}\right)^{\ell} + 1 - p\right] \simeq \frac{1}{N}. \tag{23}$$

Isolating $\ell$ from the previous formula is a simple task and one obtains

$$\langle \ell(N) \rangle_p \simeq \log_{p/2}\left[\frac{1}{N}\big((N-1)p - (N-2)\big)\right]. \tag{24}$$

One sees that in the $p \to 1$ limit the average length of an adaptive walk
is recovered:

$$\langle \ell(N) \rangle_1 = \langle \ell(N) \rangle \simeq \log_2 N. \tag{25}$$

Formula (24) may be put in a more convenient form, as is shown by the
following result, analogous to Result 3 of the unperturbed model.

Result 5. Let $\langle \ell(N) \rangle_p$ and $\langle t(N) \rangle_p$
denote, respectively, the average length and duration of a p-walk. There
exists an $N$-dependent number $p_c \in \,]0,1[$ such that the following
estimates hold:

1.
$$\langle \ell(N) \rangle_p \simeq \frac{\log\left[\frac{1}{N}\,(p - p_c(N))\,(N-1)\right]}{\log\frac{p}{2}}; \tag{26}$$

2.
$$\langle t(N) \rangle_p \simeq \frac{1}{1-2p}\left((1-p)\,\langle \ell(N) \rangle_p - p\,\frac{1 - (2p)^{\langle \ell(N) \rangle_p}}{1-2p}\right). \tag{27}$$

Let us rewrite formula (24) in the form

$$\langle \ell(N) \rangle_p \simeq \frac{\log\left[\frac{1}{N}\big((N-1)p - (N-2)\big)\right]}{\log\frac{p}{2}}. \tag{28}$$

This must by definition be a positive number, though it may not be an
integer. Since its denominator is negative, so must be its numerator. But
this only holds if

$$0 < \frac{1}{N}\big((N-1)p - (N-2)\big) < 1. \tag{29}$$

The right inequality leads to a condition for $p$ that is always satisfied;
the left inequality leads on the contrary to the requirement that

$$p > p_c(N) = 1 - \frac{1}{N-1}. \tag{30}$$

Minor rearrangements of the terms in formula (24) thus lead to part 1. of
Result 5.

We see now from formula (26) that the average length of a p-walk diverges
as $p \to p_c(N)^+$. Note that the critical threshold $p_c$ depends on the
dimension of the phase space. Also, it is simple to check that, for all
$N$,

$$\lim_{p \to p_c^+} \frac{\langle \ell(N) \rangle_p}{\log(p - p_c)} = \beta < 0. \tag{31}$$

This means that when we approach the critical point from above the average
length diverges as

$$\langle \ell(N) \rangle_p \sim \log (p - p_c)^{\beta} = -|\beta| \log(p - p_c). \tag{32}$$
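
To get a feeling for the numbers, the threshold and the estimated average
length can be evaluated directly. The sketch below assumes the expressions
$p_c(N) = 1 - 1/(N-1)$ and
$\langle \ell(N) \rangle_p \simeq \log[(p - p_c)(N-1)/N] / \log(p/2)$
discussed above; the function names are hypothetical and the length is
returned as infinite at or below the threshold.

```python
import math


def critical_probability(n):
    """Threshold p_c(N) = 1 - 1 / (N - 1) of the p-walk model."""
    return 1.0 - 1.0 / (n - 1)


def mean_length(n, p):
    """Estimated average length of a p-walk; infinite for p <= p_c(N)."""
    pc = critical_probability(n)
    if p <= pc:
        return math.inf
    return math.log((p - pc) * (n - 1) / n) / math.log(p / 2)


if __name__ == "__main__":
    n = 1024
    print(critical_probability(n))
    for p in (1.0, 0.9999, 0.9995, 0.9991, 0.999):
        print(p, mean_length(n, p))
```

At `p = 1.0` the function returns `log2(n)`, and the value grows without
bound as `p` approaches the threshold from above.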

Let us turn to time. It is clear that a p-walk of infinite length is also
of infinite duration. Recalling that one unit of time corresponds to a
trial spin flip and fitness check in our model, let us derive an expression
for the average duration $\langle t(N) \rangle_p$ of a p-walk. The strategy
we wish to adopt is the following: since the system remains for a certain
amount of time in each configuration it visits, during which it tries point
mutations to find a fitter neighbour, we can evaluate the average time
spent in a configuration and sum over all configurations visited during a
p-walk, which on the average are $\langle \ell(N) \rangle_p$. In the case
of an adaptive walk everything is simple: since the fraction of fitter
neighbours is halved on the average at each step, the waiting time is
doubled on the average, so that in order to take the $\ell$-th step it is
necessary to wait a time $\langle \tau(\ell) \rangle = 2^{\ell-1}$ on the
average (the average waiting time to take the first step is one since all
neighbours are fitter by assumption, and so on).

Hence, when the average fraction of fitter neighbours is $2^{-\ell}$ the
time required on the average to find a fitter mutant configuration is
$2^{\ell-1}$. We could thus estimate the average waiting time
$\langle \tau(\ell) \rangle_p$ to take the $\ell$-th step of a p-walk as we
did for $\langle F(\ell) \rangle_p$, namely by formula (19), just by
substituting all average fractions of fitter neighbours
$2^{-\ell}, \dots, 2^{-1}$ with the corresponding average waiting times
$2^{\ell-1}, \dots, 1$. We have

$$\langle \tau(\ell) \rangle_p = \sum_{m=1}^{\ell} P(\Omega_m)\,2^{m-1}; \tag{33}$$

using the probabilities (17) and (18) we obtain immediately

$$\langle \tau(\ell) \rangle_p = p^{\ell-1}\,2^{\ell-1} + (1-p) \sum_{k=0}^{\ell-2} p^{k}\,2^{k}. \tag{34}$$

Performing the sum we finally arrive at

$$\langle \tau(\ell) \rangle_p = \frac{1}{1-2p}\left(1 - p - p\,(2p)^{\ell-1}\right), \tag{35}$$

which is what we were looking for. One sees that in the no-perturbation
limit $p \to 1$ this result leads to
$\langle \tau(\ell) \rangle_1 = 2^{\ell-1}$, as expected. Furthermore, note
that for $\ell = 1$ we get $\langle \tau(1) \rangle_p = 1$ for every $p$,
which is right since in this model too the time needed to take the first
step is one.

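Summing over all configurations traversed during a p-walk on the average we
obtain an estimate for the average duration $\langle t(N) \rangle_p$:

$$\langle t(N) \rangle_p \simeq \sum_{\ell=1}^{\langle \ell(N) \rangle_p} \langle \tau(\ell) \rangle_p = \frac{1}{1-2p} \sum_{\ell=0}^{\langle \ell(N) \rangle_p - 1} \left(1 - p - p\,(2p)^{\ell}\right). \tag{36}$$

The last equality holds by virtue of the fact that in formula (35) the
dependence on $\ell$ is only in the term $(2p)^{\ell-1}$. Therefore we may
redefine $\ell$ (ranging from 1 to $\langle \ell(N) \rangle_p$) as
$\ell - 1$, and this new variable varies from 0 to
$\langle \ell(N) \rangle_p - 1$. After a minor rearrangement of the terms
we see we can split the sum in two sums:

$$\langle t(N) \rangle_p \simeq \frac{1-p}{1-2p} \sum_{\ell=0}^{\langle \ell(N) \rangle_p - 1} 1 \;-\; \frac{p}{1-2p} \sum_{\ell=0}^{\langle \ell(N) \rangle_p - 1} (2p)^{\ell}. \tag{37}$$

Now perform the sums under the assumption that $\langle \ell(N) \rangle_p$
is an integer (we are interested in an estimate; if the average length were
not an integer, we would get an estimate by summing up to
$\lfloor \langle \ell(N) \rangle_p \rfloor$):

$$\langle t(N) \rangle_p \simeq \frac{1}{1-2p}\left((1-p)\,\langle \ell(N) \rangle_p - p\,\frac{1 - (2p)^{\langle \ell(N) \rangle_p}}{1-2p}\right), \tag{38}$$

which is the estimated average duration of a p-walk. Note that in the
$p \to 1$ limit, where $\langle \ell(N) \rangle_p \to \log_2 N$, one
recovers

$$\langle t(N) \rangle_1 = \langle t(N) \rangle \simeq N - 1, \tag{39}$$

which in the large $N$ limit is what one gets from Kauffman's maximally
rugged model.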
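
The waiting-time and duration estimates lend themselves to a quick
numerical cross-check. The sketch below assumes the closed forms
$\langle \tau(\ell) \rangle_p = (1 - p - p(2p)^{\ell-1})/(1-2p)$ and
$\langle t \rangle_p = [(1-p)L - p(1-(2p)^{L})/(1-2p)]/(1-2p)$ for a walk
of $L$ steps; the function names are hypothetical, and the value $p = 1/2$
is excluded because both expressions are written with $1 - 2p$ in the
denominator.

```python
import math


def waiting_time(p, step):
    """Average waiting time to take the given step of a p-walk (p != 1/2)."""
    return (1 - p - p * (2 * p) ** (step - 1)) / (1 - 2 * p)


def mean_duration(p, length):
    """Estimated duration of a p-walk of the given length (p != 1/2)."""
    geometric = (1 - (2 * p) ** length) / (1 - 2 * p)
    return ((1 - p) * length - p * geometric) / (1 - 2 * p)


if __name__ == "__main__":
    p, length = 0.8, 6
    summed = sum(waiting_time(p, step) for step in range(1, length + 1))
    print(summed, mean_duration(p, length))  # the two values should agree
    print(mean_duration(1.0, math.log2(1024)))  # unperturbed limit, N = 1024
```

For integer lengths the sum of the waiting times and the closed form agree
up to rounding, and at `p = 1.0` with `length = log2(N)` the duration
reduces to `N - 1`.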

We have thus discovered that if $p > p_c(N)$ then the average length of
a p-walk in the rugged random landscape is finite, whereas at
$p = p_c(N)$ $\langle \ell(N) \rangle_p$ diverges. We emphasize that this
picture is qualitatively correct, despite the fact that our formula for
$\langle \ell(N) \rangle_p$ is only an estimate. The critical probability
$p_c(N)$ depends on the phase space's dimension $N$. If $N$ is large, as we
have assumed in deriving these formulas, then $p_c(N)$ is close to one.
Thus a small perturbation, which means a value of $p$ just slightly
different from 1, is sufficient to switch on the probability that the
system wanders through the rugged landscape indefinitely.

Indeed, we can render these observations more quantitative.

Result 6. Let $Q_p(\ell)$ denote the probability that a p-walk consists of
$\ell$ steps. We have

$$Q_p(\infty) \begin{cases} = 0 & \text{for } p > p_c(N), \\ > 0 & \text{for } p \le p_c(N). \end{cases} \tag{40}$$

Of course we assume the validity of the normalization condition

$$Q_p(\infty) + \sum_{\ell=1}^{\infty} Q_p(\ell) = 1, \tag{41}$$

that should hold for all $p \in [0,1]$ and where we have separated the term
corresponding to $\ell = \infty$. To prove Result 6 it is sufficient to put
$\langle \ell(N) \rangle_p$ in the form

$$\langle \ell(N) \rangle_p = \infty \cdot Q_p(\infty) + \sum_{\ell=1}^{\infty} \ell\, Q_p(\ell), \tag{42}$$

and consider that this average value is finite whenever $p > p_c(N)$, and
infinite otherwise.

We have borrowed this peculiar way of writing the average value from
percolation theory, where the average number of lattice points in a cluster
$\langle n \rangle_p$ is written as (see for example Ref. [11], where
$\langle n \rangle_p = \chi(p)$)

$$\langle n \rangle_p = \infty \cdot P_p(\infty) + \sum_{n=1}^{\infty} n\, P_p(n), \tag{43}$$

where $P_p(n)$ represents the probability that a cluster contains exactly
$n$ points. Equations (42) and (43) are not very satisfactory from a
notational point of view, since the quantity $\infty$ is treated like a
number. Anyhow, equation (40) indicates that this p-walks' model displays a
percolation-like transition, that is just analogous to the one described by

$$P_p(\infty) \begin{cases} = 0 & \text{for } p < p_c(d), \\ > 0 & \text{for } p > p_c(d), \end{cases} \tag{44}$$

which characterises "classical" percolation theory, where $d$ indicates the
dimensionality of the lattice one is considering. In the large $N$ limit of
the p-walks' model the percolation threshold is close to 1 (see formula
(30)), so that a small perturbation is enough to turn on the probability of
no arrest.



4 The thermodynamic limit

The thermodynamic limit is obtained by letting the dimension $N$ of the
phase space $\Gamma = \{+1,-1\}^N$ go to infinity. In this limit the
average length of an adaptive walk diverges logarithmically as stated by
Result 3. Hence, all p-walks are interminable. It is therefore natural to
consider, together with the $N \to \infty$ limit for $\Gamma$, the
$\ell \to \infty$ limit for $\Omega = \{p, 1-p\}^{\ell}$.

In the previous section we have seen that for finite $\ell$, $\Omega$ may
be fragmented into $\ell$ subsets which we called "types" of p-walks. All
walks of the same type, say the $m$-th ($1 \le m \le \ell$), are such that,
on the average, after $\ell$ steps the fraction of fitter neighbours or, if
one prefers, the probability to take one further step is $2^{-m}$. We have
denoted by $\Omega_m$ the $m$-th type and by $P(\Omega_m)$ the probability
that a p-walk is of the $m$-th type. We have then found that
$P(\Omega_k) = (1-p)\,p^{k-1}$ for $k = 1, \dots, \ell-1$, and that
$P(\Omega_\ell) = p^{\ell-1}$. When $\ell$ goes to infinity the number of
fragments into which the space $\Omega$ is broken diverges, but still each
fragment retains the same meaning, for the probability that a certain
p-walk is of a given type does not change if the number of types diverges.
For example, the probability that a p-walker finds half of its neighbours
fitter than itself after $\ell$ steps is always $1-p$ for all $\ell$. What
happens is just that when the walker takes an $(\ell+1)$-th step an
additional type (the $(\ell+1)$-th) must be taken into account. But its
probability $P(\Omega_{\ell+1})$ causes a change in $P(\Omega_\ell)$,
whereas the probabilities of the remaining types are unchanged.

Hence, $\Omega$'s thermodynamic limit may be thought of as if it were
constructed as follows. Take $\Omega$ as an object of size $P(\Omega) = 1$
and suppose we break it into infinitely many pieces
$\Omega_1, \Omega_2, \dots$ of sizes $W_1 = P(\Omega_1)$,
$W_2 = P(\Omega_2), \dots$ respectively. The breaking process depends on a
given real number $p \in [0,1]$. First, we tear $\Omega$ in two pieces of
sizes $W_1 = 1-p$ and $p$. Then we take the latter and tear it in two
pieces of sizes $W_2 = (1-p)p$ and $p^2$. Thirdly, we take the one of size
$p^2$ and break it in two pieces of sizes $W_3 = (1-p)p^2$ and $p^3$. In
principle, one may continue breaking the pieces of size $p^{\ell-1}$ at the
$\ell$-th step and take the $\ell \to \infty$ limit. In the end we have an
infinite set of pieces of sizes

$$W_1 = 1-p, \qquad W_2 = (1-p)\,p, \qquad \dots, \qquad W_s = (1-p)\,p^{s-1}, \qquad \dots \tag{45}$$

The sizes $W_s$ represent the probability that a p-walk is of the $s$-th
type in the thermodynamic limit. Clearly,
$\sum_s W_s = \sum_s (1-p)\,p^{s-1} = 1$. In Ref. [?] we called this a
geometrically broken object, since the sizes of the resulting pieces form a
geometric sequence. In fact, $W_{s+1} = p\,W_s$ for $s = 1, 2, \dots$.

Now suppose we are given a certain number of p-walkers, each of which
chooses its value of $p$ from a given probability density $\rho(p)$ on the
$[0,1]$ interval. To each of these will correspond a specific rupture of
the space $\Omega$ of p-walks, since for the geometrical breaking the
weights $W_s$ of the types depend just on the value of $p$. Hence each
p-walker gives a breaking sample of $\Omega$. This picture is quite usual
in the theory of disordered systems, where one deals with systems having a
quenched disorder represented by a number of stationary random variables.
For each sample, namely for each choice of the quenched disorder, certain
statistical or thermodynamic extensive observables $X$ (for example, the
free energy density) may be evaluated. One is usually interested in
averaging $X$ over disorder, i.e. over all possible samplings of the
quenched random variables. The most interesting outcome in many cases is
that non self-averaging effects are present: sample-to-sample fluctuations
of $X$ do not vanish in the thermodynamic limit (i.e. when one lets the
size of the system go to infinity). This means that $\langle X \rangle$
(the average of $X$ over disorder) is finite and that
$\mathrm{var}(X) = \langle X^2 \rangle - \langle X \rangle^2$ is non zero.
The probability density $P(X)$ of $X$ remains "broad" in the thermodynamic
limit, whereas for a self-averaging quantity the probability density in the
same limit is highly concentrated around its average. As a result, the
value of a self-averaging quantity on a sufficiently large sample is a good
estimate of the ensemble average, while for non self-averaging quantities
no sample, no matter how large, is a good representative of the whole
ensemble.

More specifically, in model broken objects such as the randomly broken
object [12], as well as in other more complicated models (see Ref. [?] for
a unifying review), one finds that the sizes $W_s$ of the pieces lack
self-averaging. In all of these the thermodynamic limit is obtained by
letting the number of pieces go to infinity. The study of non
self-averaging properties of a geometrically broken object is the content
of Ref. [?]. The model turns out to be exactly solvable.

We consider in each sample the probability

$$ Y = \sum_s W_s^2 \qquad (46) $$

that two randomly chosen walks in $\Omega$ are of the same type. The aim is to
show that Y's ensemble average $\langle Y \rangle$ over disorder (that is,
over p) is non zero and that Y's variance
$\mathrm{var}(Y) = \langle Y^2 \rangle - \langle Y \rangle^2$ does not vanish.
This would yield the conclusion that the probabilities $W_s$ of the types are
non self-averaging quantities. Among other results, in Ref. we have proved
that

1. the probability density $\Pi(Y)$ of Y over all possible samples of a
geometrically broken object is given in the thermodynamic limit by

$$ \Pi(Y) = \frac{2}{(1+Y)^2}\,\rho\!\left(\frac{1-Y}{1+Y}\right). \qquad (47) $$

2. Assuming $\rho(p) = 1$ the ensemble average of Y is given by

$$ \langle Y \rangle = \int_0^1 Y\,\Pi(Y)\,dY = \log 4 - 1 \simeq 0.386\ldots. \qquad (48) $$

3. Under the same assumption one can calculate the second moment
$\langle Y^2 \rangle$ of Y and show that the variance is given by

$$ \mathrm{var}(Y) = \langle Y^2 \rangle - \langle Y \rangle^2 \simeq 0.078\ldots. \qquad (49) $$
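
Results 2 and 3 can be checked numerically. The following Python sketch is an
illustration only, not part of the original derivation. It assumes the
geometric weights $W_s = (1-p)\,p^{s-1}$, for which the sum in (46) is a
geometric series equal to $Y = (1-p)/(1+p)$, and it averages over uniform p
with a midpoint rule. The function names and the grid size n are arbitrary
choices made for this example.

```python
import math

def y_of_p(p):
    """Y = sum_s W_s^2 for W_s = (1 - p) * p**(s - 1), summed in closed form."""
    return (1.0 - p) / (1.0 + p)

def moments_of_y(n=200_000):
    """Midpoint-rule estimates of <Y> and <Y^2> for p uniform on [0, 1]."""
    h = 1.0 / n
    m1 = m2 = 0.0
    for i in range(n):
        y = y_of_p((i + 0.5) * h)
        m1 += y
        m2 += y * y
    return m1 * h, m2 * h

mean_y, second_y = moments_of_y()
var_y = second_y - mean_y ** 2

print("<Y>    =", mean_y, " (log 4 - 1 =", math.log(4) - 1, ")")
print("var(Y) =", var_y)
```

Under these assumptions the estimates should agree with (48) and (49) to the
digits quoted there.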
We thus come to the interesting conclusion that in the thermodynamic
limit of the p-walks' model non self-averaging effects are present: the
probabilities that a p-walk is of a given type lack self-averaging (i.e. they
remain sample dependent). In other terms, the probability Y that two
p-walkers with the same p make walks of the same type has non zero average
and finite variance, despite the fact that there are infinitely many
different types.

Let us now turn to a different problem. Consider two p-walkers with
freedom parameters $p_1$ and $p_2$ respectively. We know that for each of them
the probability that a p-walk is of the s-th type is given by
$W_s(p_i) = (1 - p_i)\,p_i^{s-1}$, for i = 1, 2. Let us define the variable

$$ Z = Z(p_1, p_2) = \sum_s W_s(p_1)\,W_s(p_2), \qquad (50) $$

giving the probability that a randomly chosen $p_1$-walk and a randomly chosen
$p_2$-walk in $\Omega$ are of the same type. The ensemble average
$\langle Z \rangle$ has to be evaluated over all possible choices of $p_1$ and
$p_2$. $\langle Z \rangle$ bears some resemblance to a correlation function in
the space $\Omega$ of p-walks. We shall now prove that it is possible to
calculate the probability density $\Phi(Z)$ of Z, such that the probability
that, for a given choice of $p_1$ and $p_2$, Z lies in the [Z, Z + dZ]
interval is given by $\Phi(Z)\,dZ$. More precisely we prove the following:

Result 7. If both $p_1$ and $p_2$ are chosen from a uniform probability
density on the [0, 1] interval, then the probability density $\Phi(Z)$ of Z
is given by

$$ \Phi(Z) = \frac{2}{(1+Z)^3}\left(1 - Z^2 - 2Z\log Z\right). \qquad (51) $$

This allows us to evaluate $\langle Z \rangle$ and
$\mathrm{var}(Z) = \langle Z^2 \rangle - \langle Z \rangle^2$. One finds that

$$ \langle Z \rangle = \int_0^1 Z\,\Phi(Z)\,dZ \simeq 0.289867\ldots \qquad (52) $$

$$ \langle Z^2 \rangle = \int_0^1 Z^2\,\Phi(Z)\,dZ \simeq 0.130395\ldots \qquad (53) $$

$$ \mathrm{var}(Z) \simeq 0.04637\ldots \qquad (54) $$

This tells us that, like Y, Z is non self-averaging. But it also tells us that
the values of Z are more concentrated around its average than those of Y,
at least for uniform $\rho$, since var(Z) is smaller than var(Y).

We begin by calculating Z for two given values of $p_1$ and $p_2$. One has

$$ Z = \sum_{s=1}^{\infty} (1-p_1)\,p_1^{s-1}\,(1-p_2)\,p_2^{s-1} = (1-p_1)(1-p_2)\sum_s (p_1 p_2)^{s-1}, \qquad (55) $$

whence

$$ Z(p_1, p_2) = \frac{(1-p_1)(1-p_2)}{1 - p_1 p_2}. \qquad (56) $$
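
As a sanity check on the closed form for Z and on the moments quoted in
(52)-(54), one can compare a truncated sum over types with the closed form
and then average over a uniform grid in $(p_1, p_2)$. This Python sketch is
illustrative only: the function names, the truncation at 200 terms and the
grid size are assumptions made for the example, and the midpoint rule stands
in for the analytic integration carried out in the text.

```python
def z_series(p1, p2, terms=200):
    """Truncated sum over types s of W_s(p1) * W_s(p2), W_s(p) = (1-p) p^(s-1)."""
    return sum((1 - p1) * p1 ** (s - 1) * (1 - p2) * p2 ** (s - 1)
               for s in range(1, terms + 1))

def z_closed(p1, p2):
    """Closed form obtained by summing the geometric series."""
    return (1 - p1) * (1 - p2) / (1 - p1 * p2)

def z_moments(n=400):
    """Midpoint-rule estimates of <Z> and <Z^2> for p1, p2 uniform on [0, 1]."""
    h = 1.0 / n
    m1 = m2 = 0.0
    for i in range(n):
        p1 = (i + 0.5) * h
        for j in range(n):
            z = z_closed(p1, (j + 0.5) * h)
            m1 += z
            m2 += z * z
    return m1 * h * h, m2 * h * h

series_gap = abs(z_series(0.3, 0.8) - z_closed(0.3, 0.8))
mean_z, second_z = z_moments()
var_z = second_z - mean_z ** 2

print("series vs closed form:", series_gap)
print("<Z>, <Z^2>, var(Z):", mean_z, second_z, var_z)
```

The grid midpoints never reach $p_1 = p_2 = 1$, so the closed form is always
well defined in this check.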

For simplicity of notation set $p_1 = x$ and $p_2 = y$. Let

$$ \zeta(x, y) = \frac{(1-x)(1-y)}{1 - xy} \qquad (57) $$

and define the region $D(Z) \subset [0, 1]^2$

$$ D(Z) = \{(x, y) \in [0, 1]^2 : (\zeta(x, y) < Z) \wedge (Z \in [0, 1])\}. \qquad (58) $$

Suppose that x and y are random variables with probability distributions
$\rho(x)$ and $\rho(y)$. The probability $P\{\zeta(x, y) < Z\} = F(Z)$ is simply

$$ F(Z) = \iint_{D(Z)} \rho(x)\,\rho(y)\,dx\,dy. \qquad (59) $$

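F(Z) and $\Phi(Z)$ are related by

$$ \Phi(Z) = \frac{dF(Z)}{dZ}, \qquad (60) $$

therefore calculating F(Z) is the crucial step towards $\Phi(Z)$. Suppose for
simplicity that $\rho = 1$, so that F(Z) is the area of D(Z). From definition
(57) we see that for x = 0, $\zeta(0, y) = 1 - y$. Hence the curve
$\zeta(x, y) = Z$ touches the y axis at the point $y_0 = 1 - Z$. We thus
construct the rectangle $R(Z) = [y_0, 1] \times [0, 1]$, that is
$y \in [y_0, 1]$ and $x \in [0, 1]$ (as shown in the figure), and note that it
is contained in D(Z), because $\zeta$ decreases with x and therefore
$\zeta(x, y) \le 1 - y \le Z$ there. F(Z) may thus be separated as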



$$ F(Z) = \iint_{R(Z)} dx\,dy + \iint_{D(Z)\setminus R(Z)} dx\,dy. \qquad (61) $$

The first integral is equal to the area of R(Z), that is Z. As for the second
integral, we evaluate it with x running from the curve $\zeta(x, y) = Z$ to 1
and y ranging from 0 to $y_0 = 1 - Z$. The coordinates x of the points on the
curve $\zeta(x, y) = Z$ have the form

$$ x(y, Z) = \frac{1 - y - Z}{1 - y(Z + 1)}, \qquad (62) $$

as can be seen by inversion of definition (57). Therefore

$$ \iint_{D(Z)\setminus R(Z)} dx\,dy = \int_0^{1-Z} dy \int_{x(y,Z)}^{1} dx = \int_0^{1-Z} \bigl(1 - x(y, Z)\bigr)\,dy. \qquad (63) $$
We thus need to calculate the integral

$$ \iint_{D(Z)\setminus R(Z)} dx\,dy = \int_0^{1-Z} \left[ 1 - \frac{1 - y - Z}{1 - y(Z + 1)} \right] dy. \qquad (64) $$

This is quite a simple task, and the result is

$$ \iint_{D(Z)\setminus R(Z)} dx\,dy = \frac{Z}{(1+Z)^2}\left(1 - Z^2 - 2Z\log Z\right). \qquad (65) $$
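
The intermediate steps behind (65) are not spelled out. One way to fill them
in, using only the inversion of (57) and elementary integration, is sketched
below; here $x(y, Z) = (1 - y - Z)/(1 - (1+Z)y)$ is the solution of
$\zeta(x, y) = Z$ for x, and the denominator $1 - (1+Z)y$ stays positive on
$[0, 1 - Z]$ since it decreases from 1 to $Z^2$.

```latex
\begin{align*}
1 - x(y,Z) &= 1 - \frac{1-y-Z}{1-(1+Z)y} = \frac{Z\,(1-y)}{1-(1+Z)y}, \\
\frac{1-y}{1-(1+Z)y} &= \frac{1}{1+Z}\left[\,1 + \frac{Z}{1-(1+Z)y}\right], \\
\int_0^{1-Z}\frac{dy}{1-(1+Z)y}
  &= -\frac{\log\bigl(1-(1+Z)(1-Z)\bigr)}{1+Z} = -\frac{2\log Z}{1+Z}, \\
\int_0^{1-Z}\bigl(1-x(y,Z)\bigr)\,dy
  &= \frac{Z}{1+Z}\left[(1-Z) - \frac{2Z\log Z}{1+Z}\right]
   = \frac{Z}{(1+Z)^2}\left(1 - Z^2 - 2Z\log Z\right).
\end{align*}
```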



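We finally obtain F(Z) from identity (61):

$$ F(Z) = \frac{2Z}{(1+Z)^2}\bigl(1 + Z(1 - \log Z)\bigr). \qquad (66) $$

Differentiating this with respect to Z we get at last Result 7:

$$ \Phi(Z) = \frac{dF(Z)}{dZ} = \frac{2}{(1+Z)^3}\left(1 - Z^2 - 2Z\log Z\right). $$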
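
The distribution function and the density obtained in this derivation can be
cross-checked numerically. The Python sketch below is an illustration under
stated assumptions, not the method used in the text: it takes
$F(Z) = 2Z\,(1 + Z(1 - \log Z))/(1+Z)^2$ and
$\Phi(Z) = 2\,(1 - Z^2 - 2Z\log Z)/(1+Z)^3$, estimates the area of the region
$\zeta(x, y) < Z$ by counting midpoints of a uniform grid on the unit square,
and integrates $\Phi$ with a midpoint rule. All function names and grid sizes
are hypothetical choices for this example.

```python
import math

def zeta(x, y):
    """Probability that an x-walk and a y-walk are of the same type."""
    return (1.0 - x) * (1.0 - y) / (1.0 - x * y)

def F_closed(z):
    """Closed-form distribution function of Z for uniform p1, p2."""
    return 2.0 * z * (1.0 + z * (1.0 - math.log(z))) / (1.0 + z) ** 2

def Phi_closed(z):
    """Closed-form density of Z, the derivative of F_closed."""
    return 2.0 * (1.0 - z * z - 2.0 * z * math.log(z)) / (1.0 + z) ** 3

def F_grid(z, n=400):
    """Fraction of grid midpoints (x, y) in the unit square with zeta < z."""
    h = 1.0 / n
    count = 0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            if zeta(x, (j + 0.5) * h) < z:
                count += 1
    return count / (n * n)

def midpoint(f, n=20_000):
    """Midpoint-rule integral of f over (0, 1); avoids the endpoint z = 0."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

checks = [(z, F_grid(z), F_closed(z)) for z in (0.2, 0.5, 0.8)]
norm = midpoint(Phi_closed)
mean_from_phi = midpoint(lambda z: z * Phi_closed(z))

for z, area, closed in checks:
    print(z, area, closed)
print("integral of Phi:", norm, " <Z> from Phi:", mean_from_phi)
```

The grid count is only accurate to about one part in n, so agreement with
the closed form is expected to two or three digits rather than to machine
precision.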



5 Conclusion 

To summarize, we have studied an abstract evolutionary model in which the
system's size is N and phase space V has $2^N$ configurations. The
evolutionary rule $F_p$ is a stochastic map that depends on a real parameter
$p \in [0, 1]$. For p = 1 we recover Kauffman's maximally rugged model and
trajectories to local fitness optima are adaptive walks. For generic p we
have introduced p-walks. In the finite N case we have shown that the average
length of a p-walk as estimated by Result 5 is finite whenever $p > p_c(N)$,
where the critical value $p_c$ is given by $p_c(N) = 1 - 1/(N - 1)$. When
$p \to p_c(N)^+$ and for all $p < p_c(N)$ the average length diverges. This
results in a percolation-like phase transition. In the supercritical phase
($p > p_c(N)$) all p-walks are of finite length, whereas in the subcritical
phase ($p < p_c(N)$) the probability of an infinitely long p-walk is non
zero. In the thermodynamic limit $N \to \infty$ we have emphasized the fact
that the dimension of the space $\Omega$ of p-walks must be considered
infinite. $\Omega$ contains all representations of p-walks of a given length,
hence the thermodynamic limit yields a divergence in the number of different
possible evolutions. We have shown that $\Omega$ may be partitioned into
infinitely many subsets grouping "similar" p-walks. This fragmentation is
analogous to that of a geometrically broken object. Hence, we were able to
prove that non self-averaging effects are present: the probability Y that two
p-walkers with the same value of p have similar evolutions has non zero
average and finite variance, even though the number of different types of
evolutions is infinite. Lastly, we have studied the probability Z that two
different p-walkers (with different values of p) have similar evolutions and
have shown that Z is also non self-averaging. The simplicity of the model has
made it possible to obtain analytical results in the thermodynamic limit for
both Y and Z.

These results deserve some comment. The p-walks' model seems to be
versatile for different metaphoric interpretations, mostly because of its
simple definition. Yet it has turned out to display a rich and non-trivial
behaviour even in the thermodynamic limit. It represents another non
self-averaging model, adding to a list which indicates the strong need to
find a more general theory, or at least the universality underlying the
presence of this phenomenon in many different contexts. We have also stressed
in the introduction that we have worked out this model as a model of an
abstract behaviour. Nevertheless, a comparison with biological evolutionary
models is possible. Ref. 0] offers a detailed account of the biological side
of non self-averaging effects. Interestingly, quantities such as Y in
abstract disordered models are measurable quantities for biological systems.
More precisely, in population genetics Y corresponds to a parameter called
homozygosity, giving the probability that two genes sampled randomly at the
same locus in two individuals are identical. It is an experimental fact, as
is explained in Ref. 0], that Y has a broad distribution for a large number
of polymorphic loci in Drosophila. This may be taken as evidence that the
evolutionary process is non self-averaging. From this viewpoint, we think our
model shows that a less strict dynamical rule is necessary for non
self-averaging effects to appear in a Kauffman-type model. In a tightly
adaptive situation two systems undergoing biological evolution will always be
doing the same type of walk, which would mean Y = 1 with a trivial
distribution. On the contrary, if a certain variability is allowed, the
probability Y that the two systems find themselves in similar states is still
non zero on average, but its distribution is broad and non trivial. This kind
of evolution appears to be closer to that implied by the experimental results
on Drosophila. Note that the existence of a variability in the rule implies a
non zero probability of failure, which in our model is the very feature
leading to non self-averaging effects.